Building XML data warehouse based on frequent patterns in user queries
[Abstract]: With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Due to the extremely large amount of XML data available on the web, unguided warehousing of XML data is highly costly and usually cannot accommodate users’ needs in XML data acquisition well. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of integrated XML documents in the warehouse are built using these frequent query patterns, represented as Frequent Query Pattern Trees (FreqQPTs). Using a hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that the overall processing of the same queries issued against the global schema becomes much more efficient when using the XML data warehouse than when directly searching the multiple data sources.
A Novel Euler's Elastica based Segmentation Approach for Noisy Images via using the Progressive Hedging Algorithm
Euler's Elastica based unsupervised segmentation models have strong
capability of completing the missing boundaries for existing objects in a clean
image, but they do not work well for noisy images. This paper aims to
establish an Euler's Elastica based approach that properly deals with random
noise to improve the segmentation performance for noisy images. We solve the
corresponding optimization problem using the progressive hedging algorithm
(PHA) with a step length suggested by the alternating direction method of
multipliers (ADMM). Technically, all the simplified convex versions of the
subproblems derived from the major framework of PHA can be obtained by using
the curvature weighted approach and the convex relaxation method. An
alternating optimization strategy is then applied, which benefits from
powerful acceleration techniques including the fast Fourier transform (FFT) and
generalized soft threshold formulas. Extensive experiments on both synthetic
and real images validate significant gains of the proposed segmentation models
and demonstrate the advantages of the developed algorithm.
Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning
In many real-world tasks, the concerned objects can be represented as a
multi-instance bag associated with a candidate label set, which consists of one
ground-truth label and several false positive labels. Multi-instance
partial-label learning (MIPL) is a learning paradigm to deal with such tasks
and has achieved favorable performance. The existing MIPL approach follows the
instance-space paradigm by assigning augmented candidate label sets of bags to
each instance and aggregating bag-level labels from instance-level labels.
However, this scheme may be suboptimal as global bag-level information is
ignored and the predicted labels of bags are sensitive to predictions of
negative instances. In this paper, we study an alternative scheme where a
multi-instance bag is embedded into a single vector representation.
Accordingly, an intuitive algorithm named DEMIPL, i.e., Disambiguated attention
Embedding for Multi-Instance Partial-Label learning, is proposed. DEMIPL
employs a disambiguation attention mechanism to aggregate a multi-instance bag
into a single vector representation, followed by a momentum-based
disambiguation strategy to identify the ground-truth label from the candidate
label set. Furthermore, we introduce a real-world MIPL dataset for colorectal
cancer classification. Experimental results on benchmark and real-world
datasets validate the superiority of DEMIPL against the compared MIPL and
partial-label learning approaches.
Comment: Accepted at NeurIPS 202
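As a rough illustration of how an attention mechanism can aggregate a multi-instance bag into a single vector, the sketch below applies plain softmax attention pooling. It is not DEMIPL's disambiguation attention: the function name `attention_pool` and the per-instance `scores` are hypothetical, and in a real model the scores would come from a learned network rather than being passed in.

```python
import math
from typing import List


def attention_pool(instances: List[List[float]], scores: List[float]) -> List[float]:
    """Aggregate a bag of instance vectors into one vector.

    `scores` holds one unnormalized attention score per instance
    (assumed given here; a real model would learn them).
    """
    # Numerically stable softmax over the instance scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the instance vectors, one dimension at a time.
    dim = len(instances[0])
    return [
        sum(w * inst[d] for w, inst in zip(weights, instances))
        for d in range(dim)
    ]


# A bag with two 2-dimensional instances.
bag = [[1.0, 0.0], [3.0, 2.0]]
uniform = attention_pool(bag, [0.0, 0.0])    # equal scores: plain average
focused = attention_pool(bag, [100.0, 0.0])  # first instance dominates
```

With equal scores the pooled vector is the average of the instances; a much larger score on one instance makes the bag representation nearly identical to that instance.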
Transformer-based Multi-Instance Learning for Weakly Supervised Object Detection
Weakly Supervised Object Detection (WSOD) enables the training of object
detection models using only image-level annotations. State-of-the-art WSOD
detectors commonly rely on multi-instance learning (MIL) as the backbone of
their detectors and assume that the bounding box proposals of an image are
independent of each other. However, since such approaches only utilize the
highest score proposal and discard the potentially useful information from
other proposals, their independent MIL backbone often limits models to salient
parts of an object or causes them to detect only one object per class. To solve
the above problems, we propose a novel backbone for WSOD based on our tailored
Vision Transformer named Weakly Supervised Transformer Detection Network
(WSTDN). Our algorithm is the first to demonstrate that self-attention
modules that consider inter-instance relationships are effective backbones for
WSOD. We also introduce a novel bounding box mining method (BBM) integrated
with a memory transfer refinement (MTR) procedure to utilize the instance
dependencies for facilitating instance refinements. Experimental results on
PASCAL VOC2007 and VOC2012 benchmarks demonstrate the effectiveness of our
proposed WSTDN and modified instance refinement modules.
Weakly supervised POS tagging without disambiguation
Weakly supervised part-of-speech (POS) tagging is the task of learning to predict the POS tag for a given word in context by making use of partially annotated data instead of fully tagged corpora. Weakly supervised POS tagging would benefit various natural language processing applications in languages where tagged corpora are mostly unavailable.
In this article, we propose a novel framework for weakly supervised POS tagging based on a dictionary of words with their possible POS tags. In the constrained error-correcting output codes (ECOC)-based approach, a unique L-bit vector is assigned to each POS tag. The set of bit vectors is referred to as a coding matrix with values in {1, -1}. Each column of the coding matrix specifies a dichotomy over the tag space to learn a binary classifier. For each binary classifier, its training data is generated in the following way: each pair of a word and its possible POS tags is considered a positive training example only if the whole set of its possible tags falls into the positive dichotomy specified by the column coding, and similarly for negative training examples. Given a word in context, its POS tag is predicted by concatenating the predictive outputs of the L binary classifiers and choosing the tag with the closest distance according to some measure. By incorporating the ECOC strategy, the set of all possible tags for each word is treated as a whole without the need to perform disambiguation. Moreover, instead of the manual feature engineering employed in most previous POS tagging approaches, features for training and testing in the proposed framework are automatically generated using neural language modeling. The proposed framework has been evaluated on three corpora for English, Italian, and Malagasy POS tagging, achieving accuracies of 93.21%, 90.9%, and 84.5%, respectively, which shows a significant improvement compared to the state-of-the-art approaches.
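The constrained ECOC scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4-bit `CODING_MATRIX`, the tag names, and the helper names `column_label` and `decode` are hypothetical, and Hamming distance on output signs stands in for the unspecified distance measure.

```python
from typing import Dict, List, Optional, Set

# Hypothetical coding matrix: one unique row of +1/-1 bits per POS tag (L = 4).
CODING_MATRIX: Dict[str, List[int]] = {
    "NOUN": [1, 1, -1, -1],
    "VERB": [1, -1, 1, -1],
    "ADJ": [-1, 1, 1, -1],
    "DET": [-1, -1, -1, 1],
}


def column_label(candidate_tags: Set[str], column: int) -> Optional[int]:
    """Training label of a word for one binary classifier.

    Returns +1 (or -1) only if every candidate tag of the word falls on the
    positive (or negative) side of the column's dichotomy; otherwise the word
    is not used as a training example for this column (None).
    """
    bits = {CODING_MATRIX[tag][column] for tag in candidate_tags}
    return bits.pop() if len(bits) == 1 else None


def decode(outputs: List[float]) -> str:
    """Choose the tag whose codeword is closest to the L classifier outputs.

    Distance here is Hamming distance on signs (an assumption).
    """
    def hamming(code: List[int]) -> int:
        return sum(1 for bit, out in zip(code, outputs) if bit * out <= 0)

    return min(CODING_MATRIX, key=lambda tag: hamming(CODING_MATRIX[tag]))


# A word whose dictionary entry allows NOUN or VERB.
ambiguous = {"NOUN", "VERB"}
labels = [column_label(ambiguous, c) for c in range(4)]
predicted = decode([0.9, -0.2, 0.7, -0.8])
```

In this toy setup the ambiguous word serves as a positive example for column 0 and a negative example for column 3, and is skipped for columns 1 and 2, where its candidate tags disagree. No single tag ever has to be picked for it during training.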
Hypoxia-inducible factor 1-mediated regulation of PPP1R3C promotes glycogen accumulation in human MCF-7 cells under hypoxia
Hundreds of genes can be regulated by hypoxia-inducible factor 1 (HIF1) under hypoxia. Here we demonstrated a HIF1-mediated induction of protein phosphatase 1, regulatory subunit 3C gene (PPP1R3C) in human MCF7 cells under hypoxia. By mutation analysis we confirmed the presence of a functional hypoxia response element that is located 229 bp upstream from the PPP1R3C gene. PPP1R3C induction correlates with a significant glycogen accumulation in MCF7 cells under hypoxia. Knockdown of either HIF1α or PPP1R3C attenuated hypoxia-induced glycogen accumulation significantly. Knockdown of HIF2α reduced hypoxia-induced glycogen accumulation slightly (but not significantly). Our results demonstrated that HIF1 promotes glycogen accumulation through regulating PPP1R3C expression under hypoxia, which revealed a novel metabolic adaptation of cells to hypoxia.